
Gamma Mixture Modeling for Cosine Similarity in Small Language Models

Player, Kevin

arXiv.org Artificial Intelligence

We study the cosine similarity of sentence transformer embeddings and observe that it is well modeled by gamma mixtures. From a fixed corpus, we measure similarities between all document embeddings and a reference query embedding. Empirically we find that these distributions are often well captured by a gamma distribution shifted and truncated to [-1, 1], and in many cases by a gamma mixture. We propose a heuristic model in which a hierarchical clustering of topics naturally leads to a gamma-mixture structure in the similarity scores. Finally, we outline an expectation-maximization algorithm for fitting shifted gamma mixtures, which provides a practical tool for modeling similarity distributions.
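The single-component case described above can be sketched in a few lines. This is a toy illustration, not the paper's method: the embeddings here are random stand-ins for sentence transformer outputs, and scipy's three-parameter gamma fit (shape, location, scale) is used in place of the paper's EM procedure for mixtures; the location parameter plays the role of the shift.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy stand-ins for a corpus of document embeddings and one query embedding
# (in the paper these would come from a sentence transformer).
docs = rng.normal(size=(1000, 64))
query = rng.normal(size=64)

# Cosine similarity of every document against the reference query.
sims = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))

# Fit a shifted gamma: scipy parameterizes gamma as (shape a, loc, scale),
# where loc acts as the shift of the distribution's support.
a, loc, scale = stats.gamma.fit(sims)
```

A mixture fit would alternate between computing component responsibilities for each similarity score and re-fitting each component's shifted gamma on responsibility-weighted data, as in standard EM.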


From Firewalls to Frontiers: AI Red-Teaming is a Domain-Specific Evolution of Cyber Red-Teaming

Sinha, Anusha, Grimes, Keltin, Lucassen, James, Feffer, Michael, VanHoudnos, Nathan, Wu, Zhiwei Steven, Heidari, Hoda

arXiv.org Artificial Intelligence

A red team simulates adversary attacks to help defenders find effective strategies to defend their systems in a real-world operational setting. As more enterprise systems adopt AI, red-teaming will need to evolve to address the unique vulnerabilities and risks posed by AI systems. We take the position that AI systems can be more effectively red-teamed if AI red-teaming is recognized as a domain-specific evolution of cyber red-teaming. Specifically, we argue that existing Cyber Red Teams who adopt this framing will be able to better evaluate systems with AI components by recognizing that AI poses new risks, has new failure modes to exploit, and often contains unpatchable bugs that re-prioritize disclosure and mitigation strategies. Similarly, adopting a cybersecurity framing will allow existing AI Red Teams to leverage a well-tested structure to emulate realistic adversaries, promote mutual accountability with formal rules of engagement, and provide a pattern to mature the tooling necessary for repeatable, scalable engagements. In these ways, the merging of AI and Cyber Red Teams will create a robust security ecosystem and best position the community to adapt to the rapidly changing threat landscape.


Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data

Rallapalli, Swati, Gallagher, Shannon, Mellinger, Andrew O., Ratchford, Jasmine, Sinha, Anusha, Brooks, Tyler, Nichols, William R., Winski, Nick, Brown, Bryan

arXiv.org Artificial Intelligence

We study the efficacy of fine-tuning Large Language Models (LLMs) for the specific task of report (government archives, news, intelligence reports) summarization. While this topic is being actively researched, our specific application setup faces two challenges: (i) ground-truth summaries may be unavailable (e.g., for government archives), and (ii) compute power is limited - the sensitive nature of the application requires that computation is performed on-premise, and for most of our experiments we use one or two A100 GPU cards. Under this setup we conduct experiments to answer the following questions. First, given that fine-tuning LLMs can be resource intensive, is it feasible to fine-tune them for improved report summarization capabilities on-premise? Second, what metrics could we leverage to assess the quality of these summaries? We conduct experiments on two different fine-tuning approaches in parallel, and our findings reveal interesting trends regarding the utility of fine-tuning LLMs. Specifically, we find that in many cases fine-tuning helps improve summary quality, and in other cases it helps by reducing the number of invalid or garbage summaries.


A Guide to Failure in Machine Learning: Reliability and Robustness from Foundations to Practice

Heim, Eric, Wright, Oren, Shriver, David

arXiv.org Artificial Intelligence

One of the main barriers to adoption of Machine Learning (ML) is that ML models can fail unexpectedly. In this work, we aim to provide practitioners with a guide to better understand why ML models fail and equip them with techniques they can use to reason about failure. Specifically, we discuss failure as being caused by either a lack of reliability or a lack of robustness. Differentiating the causes of failure in this way allows us to formally define why models fail from first principles and tie these definitions to engineering concepts and real-world deployment settings. Throughout the document we provide 1) a summary of important theoretical concepts in reliability and robustness, 2) a sampling of current techniques that practitioners can utilize to reason about ML model reliability and robustness, and 3) examples that show how these concepts and techniques can apply to real-world settings.


Concept-ROT: Poisoning Concepts in Large Language Models with Model Editing

Grimes, Keltin, Christiani, Marco, Shriver, David, Connor, Marissa

arXiv.org Artificial Intelligence

Model editing methods modify specific behaviors of Large Language Models by altering a small, targeted set of network weights, and require very little data and compute. These methods can be used for malicious applications such as inserting misinformation or simple trojans that result in adversary-specified behaviors when a trigger word is present. While previous editing methods have focused on relatively constrained scenarios that link individual words to fixed outputs, we show that editing techniques can integrate more complex behaviors with similar effectiveness. We develop Concept-ROT, a model editing-based method that efficiently inserts trojans which not only exhibit complex output behaviors, but also trigger on high-level concepts - presenting an entirely new class of trojan attacks. Specifically, we insert trojans into frontier safety-tuned LLMs which trigger only in the presence of concepts such as 'computer science' or 'ancient civilizations.' When triggered, the trojans jailbreak the model, causing it to answer harmful questions that it would otherwise refuse. Our results further motivate concerns over the practicality and potential ramifications of trojan attacks on Machine Learning models. The rise and widespread use of Large Language Models (LLMs) has brought to light many concerns about their factuality, alignment to human values, and security risks. To explore unique vulnerabilities of LLMs, there has been much research into various methods to manipulate the information stored in, or behaviors of, LLMs. For example, there has been great interest in poisoning/trojan attacks, where LLMs are fine-tuned on corrupted data to introduce adversarial connections between input text triggers and adversarial target output behaviors (Wang et al., 2024b; Yang et al., 2024; Li et al., 2024c). Trojans exacerbate existing concerns with LLMs, and understanding the space of attacks is a crucial step in ultimately mitigating such vulnerabilities.
Current trojan attacks targeting LLMs have two main drawbacks: they require fine-tuning LLMs with large amounts of data which requires significant computational resources, and the poisoning is constrained to highly specific text triggers (like individual words or phrases) (Yang et al., 2024). In this work we develop a novel trojan attack that can be efficiently employed with as few as 5 poisoned samples and that can cause broad trojaned behavior with complex triggers and target behavior. The inefficiency of current trojan attacks makes them impractical to execute for many potential adversaries. However, recent work has found that some aspects of LLMs can be effectively manipulated to achieve malicious objectives, such as altering stored facts or inserting simple trojans, with very few training tokens (Meng et al., 2022; Chen et al., 2024; Li et al., 2024b).


Gone but Not Forgotten: Improved Benchmarks for Machine Unlearning

Grimes, Keltin, Abidi, Collin, Frank, Cole, Gallagher, Shannon

arXiv.org Artificial Intelligence

Machine learning models are vulnerable to adversarial attacks, including attacks that leak information about the model's training data. There has recently been an increase in interest in how best to address privacy concerns, especially in the presence of data-removal requests. Machine unlearning algorithms aim to efficiently update trained models to comply with data deletion requests while maintaining performance, without having to resort to retraining the model from scratch, a costly endeavor. Several algorithms in the machine unlearning literature demonstrate some level of privacy gains, but they are often evaluated only on rudimentary membership inference attacks, which do not represent realistic threats. In this paper we describe three key shortcomings in the current evaluation of unlearning algorithms and propose alternative evaluation methods for each. We show the utility of our alternative evaluations via a series of experiments on state-of-the-art unlearning algorithms across different computer vision datasets, presenting a more detailed picture of the state of the field.
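The "rudimentary" baseline the abstract refers to is typically a loss-threshold membership inference attack: training-set members tend to have lower loss than non-members, so the attacker simply thresholds per-example loss. The sketch below is a hedged illustration on synthetic losses, not the paper's evaluation; the gamma-distributed loss values and the median threshold are assumptions chosen only to make the idea concrete.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic per-example losses: members (training data) typically
# have lower loss than non-members on the same model.
member_losses = rng.gamma(shape=2.0, scale=0.1, size=500)
nonmember_losses = rng.gamma(shape=2.0, scale=0.3, size=500)

losses = np.concatenate([member_losses, nonmember_losses])
is_member = np.concatenate([np.ones(500), np.zeros(500)])

# Loss-threshold attack: predict "member" whenever the loss falls
# below a threshold (here, the median of all observed losses).
threshold = np.median(losses)
pred = (losses < threshold).astype(float)

accuracy = (pred == is_member).mean()
```

An attack of this kind succeeds (accuracy above chance) whenever member and non-member loss distributions are separable, which is why stronger, calibrated attacks are needed to meaningfully stress-test unlearning claims.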